Overview

Dataset statistics

Number of variables: 34
Number of observations: 95221
Missing cells: 28958
Missing cells (%): 0.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.7 MiB
Average record size in memory: 272.0 B
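
The percentage and per-record figures above can be re-derived from the raw counts. The following is a minimal sketch, assuming that "missing cells (%)" is taken over variables × observations and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 34
n_observations = 95221
missing_cells = 28958
total_size_mib = 24.7

# Assumption: the missing-cell percentage is relative to all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both derived values round to the figures shown in the report.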

Variable types

NUM: 19
BOOL: 11
CAT: 4

Warnings

AppointmentDTS has a high cardinality: 320 distinct values [High cardinality]
AppointmentMonthNBR is highly correlated with AppointmentID [High correlation]
AppointmentID is highly correlated with AppointmentMonthNBR [High correlation]
AppttoCheckoutNBR is highly correlated with CheckintoCheckoutNBR [High correlation]
CheckintoCheckoutNBR is highly correlated with AppttoCheckoutNBR [High correlation]
Noshow24NBR has 1616 (1.7%) missing values [Missing]
CancellationsNBR has 1616 (1.7%) missing values [Missing]
Latearrivals24NBR has 1616 (1.7%) missing values [Missing]
CheckintoCheckoutNBR has 6765 (7.1%) missing values [Missing]
AppttoCheckoutNBR has 6765 (7.1%) missing values [Missing]
CheckintoApptNBR has 4116 (4.3%) missing values [Missing]
Arrived24NBR has 1616 (1.7%) missing values [Missing]
Providers24CNT has 1616 (1.7%) missing values [Missing]
ThatProvider24NBR has 1616 (1.7%) missing values [Missing]
NoshowRate24NBR has 1616 (1.7%) missing values [Missing]
EdVisitsNBR is highly skewed (γ1 = 30.76642438) [Skewed]
AppointmentID has unique values [Unique]
AgeNBR has 1632 (1.7%) zeros [Zeros]
ApptLagNBR has 22713 (23.9%) zeros [Zeros]
Noshow24NBR has 56215 (59.0%) zeros [Zeros]
CancellationsNBR has 22949 (24.1%) zeros [Zeros]
Latearrivals24NBR has 21275 (22.3%) zeros [Zeros]
CheckintoApptNBR has 5038 (5.3%) zeros [Zeros]
Arrived24NBR has 2506 (2.6%) zeros [Zeros]
Providers24CNT has 2506 (2.6%) zeros [Zeros]
ThatProvider24NBR has 24825 (26.1%) zeros [Zeros]
NoshowRate24NBR has 56215 (59.0%) zeros [Zeros]
EdVisitsNBR has 67942 (71.4%) zeros [Zeros]
IpVisitsNBR has 93510 (98.2%) zeros [Zeros]

Reproduction

Analysis started: 2020-09-12 16:17:54.922672
Analysis finished: 2020-09-12 16:19:14.953589
Duration: 1 minute and 20.03 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
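
The reported duration follows from the two timestamps. A small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the timestamps printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-09-12 16:17:54.922672", FMT)
finished = datetime.strptime("2020-09-12 16:19:14.953589", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```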

Variables

AppointmentID
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct95221
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean47611
Minimum1
Maximum95221
Zeros0
Zeros (%)0.0%
Memory size743.9 KiB

Quantile statistics

Minimum1
5-th percentile4762
Q123806
median47611
Q371416
95-th percentile90460
Maximum95221
Range95220
Interquartile range (IQR)47610

Descriptive statistics

Standard deviation27488.07933
Coefficient of variation (CV)0.5773472376
Kurtosis-1.2
Mean47611
Median Absolute Deviation (MAD)23805
Skewness4.045179012e-18
Sum4533567031
Variance755594505.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
20471< 0.1%
 
231981< 0.1%
 
6611< 0.1%
 
68061< 0.1%
 
47591< 0.1%
 
272881< 0.1%
 
252411< 0.1%
 
313861< 0.1%
 
293391< 0.1%
 
191001< 0.1%
 
Other values (95211)95211> 99.9%
 
ValueCountFrequency (%) 
11< 0.1%
 
21< 0.1%
 
31< 0.1%
 
41< 0.1%
 
51< 0.1%
 
ValueCountFrequency (%) 
952211< 0.1%
 
952201< 0.1%
 
952191< 0.1%
 
952181< 0.1%
 
952171< 0.1%
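
The summary statistics for AppointmentID are what one would expect from a plain row identifier. The sketch below assumes the column holds every integer from 1 to 95221 exactly once (suggested by Distinct, Minimum and Maximum, but not stated outright) and compares the closed-form values with the report.

```python
# Assumption: AppointmentID is exactly the integers 1..n, each appearing once.
n = 95221

total = n * (n + 1) // 2          # sum of 1..n
mean = (n + 1) / 2                # mean of 1..n
variance = n * (n + 1) / 12       # sample variance (n - 1 in the denominator)
std = variance ** 0.5
cv = std / mean                   # coefficient of variation

print(total, mean, round(variance, 1), round(std, 5), round(cv, 7))
```

Under that assumption the sum, mean, variance and standard deviation agree with the values listed for this variable.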
 

PatientID
Real number (ℝ≥0)

Distinct33473
Distinct (%)35.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean16752.13458
Minimum1
Maximum33473
Zeros0
Zeros (%)0.0%
Memory size743.9 KiB

Quantile statistics

Minimum1
5-th percentile1869
Q18386
median16640
Q325323
95-th percentile31719
Maximum33473
Range33472
Interquartile range (IQR)16937

Descriptive statistics

Standard deviation9649.346171
Coefficient of variation (CV)0.5760069634
Kurtosis-1.220440449
Mean16752.13458
Median Absolute Deviation (MAD)8443
Skewness0.03000413431
Sum1595155007
Variance93109881.53
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
663560.1%
 
11962530.1%
 
915232< 0.1%
 
1430632< 0.1%
 
1682128< 0.1%
 
978128< 0.1%
 
509227< 0.1%
 
1253426< 0.1%
 
1163725< 0.1%
 
926625< 0.1%
 
Other values (33463)9488999.7%
 
ValueCountFrequency (%) 
11< 0.1%
 
24< 0.1%
 
32< 0.1%
 
41< 0.1%
 
51< 0.1%
 
ValueCountFrequency (%) 
334732< 0.1%
 
334723< 0.1%
 
334711< 0.1%
 
334701< 0.1%
 
334694< 0.1%
 

ClinicNM
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
B
29990 
A
26370 
E
15084 
C
14480 
D
9297 
ValueCountFrequency (%) 
B2999031.5%
 
A2637027.7%
 
E1508415.8%
 
C1448015.2%
 
D92979.8%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length1
Median length1
Mean length1
Min length1

AppointmentDTS
Categorical

HIGH CARDINALITY

Distinct320
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
5/24/18
 
488
3/8/18
 
476
5/8/18
 
474
5/3/18
 
472
3/20/18
 
471
Other values (315)
92840 
ValueCountFrequency (%) 
5/24/184880.5%
 
3/8/184760.5%
 
5/8/184740.5%
 
5/3/184720.5%
 
3/20/184710.5%
 
12/19/184710.5%
 
2/20/184680.5%
 
11/14/184680.5%
 
5/29/184670.5%
 
3/1/184660.5%
 
Other values (310)9050095.0%
 
Frequencies of value counts

Unique

Unique: 20
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length8
Median length7
Mean length6.977736004
Min length6

AppointmentMonthNBR
Real number (ℝ≥0)

HIGH CORRELATION

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.561451781
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Memory size743.9 KiB

Quantile statistics

Minimum1
5-th percentile1
Q14
median6
Q310
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)6

Descriptive statistics

Standard deviation3.344395106
Coefficient of variation (CV)0.5097035257
Kurtosis-1.207100685
Mean6.561451781
Median Absolute Deviation (MAD)3
Skewness0.0265449042
Sum624788
Variance11.18497863
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%) 
590819.5%
 
390549.5%
 
886879.1%
 
1084818.9%
 
484028.8%
 
282418.7%
 
1182128.6%
 
678628.3%
 
975457.9%
 
772937.7%
 
Other values (2)1236313.0%
 
ValueCountFrequency (%) 
151815.4%
 
282418.7%
 
390549.5%
 
484028.8%
 
590819.5%
 
ValueCountFrequency (%) 
1271827.5%
 
1182128.6%
 
1084818.9%
 
975457.9%
 
886879.1%
 

AppointmentWeekdayNBR
Real number (ℝ≥0)

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.988101364
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Memory size743.9 KiB

Quantile statistics

Minimum1
5-th percentile2
Q13
median4
Q35
95-th percentile6
Maximum7
Range6
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.391332725
Coefficient of variation (CV)0.3488709533
Kurtosis-1.207545006
Mean3.988101364
Median Absolute Deviation (MAD)1
Skewness0.02828131126
Sum379751
Variance1.935806751
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%) 
41998321.0%
 
51993620.9%
 
31971120.7%
 
21819619.1%
 
61692917.8%
 
74290.5%
 
137< 0.1%
 
ValueCountFrequency (%) 
137< 0.1%
 
21819619.1%
 
31971120.7%
 
41998321.0%
 
51993620.9%
 
ValueCountFrequency (%) 
74290.5%
 
61692917.8%
 
51993620.9%
 
41998321.0%
 
31971120.7%
 

AppointmentHourNBR
Real number (ℝ≥0)

Distinct13
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12.01135254
Minimum7
Maximum19
Zeros0
Zeros (%)0.0%
Memory size743.9 KiB

Quantile statistics

Minimum7
5-th percentile8
Q110
median11
Q314
95-th percentile16
Maximum19
Range12
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.769706516
Coefficient of variation (CV)0.2305907272
Kurtosis-1.208521922
Mean12.01135254
Median Absolute Deviation (MAD)2
Skewness0.08256835547
Sum1143733
Variance7.671274185
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
ValueCountFrequency (%) 
101332314.0%
 
91325813.9%
 
141284113.5%
 
151234913.0%
 
111144412.0%
 
131057011.1%
 
1682268.6%
 
880638.5%
 
716031.7%
 
1713211.4%
 
Other values (3)22232.3%
 
ValueCountFrequency (%) 
716031.7%
 
880638.5%
 
91325813.9%
 
101332314.0%
 
111144412.0%
 
ValueCountFrequency (%) 
191700.2%
 
188690.9%
 
1713211.4%
 
1682268.6%
 
151234913.0%
 

AgeNBR
Real number (ℝ≥0)

ZEROS

Distinct106
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean44.53279214
Minimum0
Maximum118
Zeros1632
Zeros (%)1.7%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile3
Q127
median46
Q362
95-th percentile80
Maximum118
Range118
Interquartile range (IQR)35

Descriptive statistics

Standard deviation22.98413032
Coefficient of variation (CV)0.5161169828
Kurtosis-0.8123779807
Mean44.53279214
Median Absolute Deviation (MAD)17
Skewness-0.131375275
Sum4240457
Variance528.2702463
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
117611.8%
 
6116351.7%
 
016321.7%
 
5516101.7%
 
5716021.7%
 
5815841.7%
 
5615781.7%
 
6215671.6%
 
6315391.6%
 
5914951.6%
 
Other values (96)7921883.2%
 
ValueCountFrequency (%) 
016321.7%
 
117611.8%
 
211471.2%
 
36180.6%
 
45820.6%
 
ValueCountFrequency (%) 
1181< 0.1%
 
1104< 0.1%
 
1043< 0.1%
 
1031< 0.1%
 
1012< 0.1%
 

SexFLG
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
F
57070 
M
38151 
ValueCountFrequency (%) 
F5707059.9%
 
M3815140.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length1
Median length1
Mean length1
Min length1

HispanicFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
82888 
1
12333 
ValueCountFrequency (%) 
08288887.0%
 
11233313.0%
 

SingleFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
1
52719 
0
42502 
ValueCountFrequency (%) 
15271955.4%
 
04250244.6%
 

LivesInApartmentFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
76525 
1
18696 
ValueCountFrequency (%) 
07652580.4%
 
11869619.6%
 

EmailFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
48513 
1
46708 
ValueCountFrequency (%) 
04851350.9%
 
14670849.1%
 

ApptLagNBR
Real number (ℝ≥0)

ZEROS

Distinct273
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean19.78240094
Minimum0
Maximum546
Zeros22713
Zeros (%)23.9%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median7
Q323
95-th percentile91
Maximum546
Range546
Interquartile range (IQR)22

Descriptive statistics

Standard deviation36.00486375
Coefficient of variation (CV)1.820045194
Kurtosis23.60839473
Mean19.78240094
Median Absolute Deviation (MAD)7
Skewness4.077167983
Sum1883700
Variance1296.350214
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
02271323.9%
 
175507.9%
 
740384.2%
 
234913.7%
 
334393.6%
 
428863.0%
 
627942.9%
 
1426442.8%
 
824472.6%
 
524092.5%
 
Other values (263)4081042.9%
 
ValueCountFrequency (%) 
02271323.9%
 
175507.9%
 
234913.7%
 
334393.6%
 
428863.0%
 
ValueCountFrequency (%) 
5461< 0.1%
 
3991< 0.1%
 
3921< 0.1%
 
3841< 0.1%
 
3791< 0.1%
 

InsuranceDSC
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
Commercial
42768 
Medicare
22848 
Medicaid
21900 
Self Pay
7705 
ValueCountFrequency (%) 
Commercial4276844.9%
 
Medicare2284824.0%
 
Medicaid2190023.0%
 
Self Pay77058.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length10
Median length8
Mean length8.898289243
Min length8

HypertensionFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
61138 
1
34083 
ValueCountFrequency (%) 
06113864.2%
 
13408335.8%
 

AsthmaFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
86255 
1
8966 
ValueCountFrequency (%) 
08625590.6%
 
189669.4%
 

HeartDiseaseFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
84832 
1
10389 
ValueCountFrequency (%) 
08483289.1%
 
11038910.9%
 

ObeseFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
70552 
1
24669 
ValueCountFrequency (%) 
07055274.1%
 
12466925.9%
 

DiabetesFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
94698 
1
523
ValueCountFrequency (%) 
09469899.5%
 
15230.5%
 

Noshow24NBR
Real number (ℝ≥0)

MISSING
ZEROS

Distinct38
Distinct (%)< 0.1%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean1.15180813
Minimum0
Maximum47
Zeros56215
Zeros (%)59.0%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile5
Maximum47
Range47
Interquartile range (IQR)1

Descriptive statistics

Standard deviation2.466546409
Coefficient of variation (CV)2.141455981
Kurtosis31.62482131
Mean1.15180813
Median Absolute Deviation (MAD)0
Skewness4.526508446
Sum107815
Variance6.083851187
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
ValueCountFrequency (%) 
05621559.0%
 
11652517.4%
 
277968.2%
 
342964.5%
 
425202.6%
 
515791.7%
 
612151.3%
 
78500.9%
 
85600.6%
 
94040.4%
 
Other values (28)16451.7%
 
(Missing)16161.7%
 
ValueCountFrequency (%) 
05621559.0%
 
11652517.4%
 
277968.2%
 
342964.5%
 
425202.6%
 
ValueCountFrequency (%) 
473< 0.1%
 
451< 0.1%
 
431< 0.1%
 
341< 0.1%
 
333< 0.1%
 

CancellationsNBR
Real number (ℝ≥0)

MISSING
ZEROS

Distinct174
Distinct (%)0.2%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean4.9391165
Minimum0
Maximum604
Zeros22949
Zeros (%)24.1%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median2
Q35
95-th percentile18
Maximum604
Range604
Interquartile range (IQR)4

Descriptive statistics

Standard deviation11.0809564
Coefficient of variation (CV)2.243509827
Kurtosis535.2164223
Mean4.9391165
Median Absolute Deviation (MAD)2
Skewness14.99097015
Sum462326
Variance122.7875948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
02294924.1%
 
11662317.5%
 
21179412.4%
 
384738.9%
 
462416.6%
 
547805.0%
 
635223.7%
 
727592.9%
 
822642.4%
 
918582.0%
 
Other values (164)1234213.0%
 
(Missing)16161.7%
 
ValueCountFrequency (%) 
02294924.1%
 
11662317.5%
 
21179412.4%
 
384738.9%
 
462416.6%
 
ValueCountFrequency (%) 
6042< 0.1%
 
6032< 0.1%
 
5371< 0.1%
 
4251< 0.1%
 
3821< 0.1%
 

Latearrivals24NBR
Real number (ℝ≥0)

MISSING
ZEROS

Distinct101
Distinct (%)0.1%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean3.863960259
Minimum0
Maximum130
Zeros21275
Zeros (%)22.3%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median2
Q35
95-th percentile14
Maximum130
Range130
Interquartile range (IQR)4

Descriptive statistics

Standard deviation6.482212343
Coefficient of variation (CV)1.67760844
Kurtosis48.48044112
Mean3.863960259
Median Absolute Deviation (MAD)2
Skewness5.427877776
Sum361686
Variance42.01907685
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
02127522.3%
 
11929720.3%
 
21308313.7%
 
393689.8%
 
468407.2%
 
548805.1%
 
637553.9%
 
727662.9%
 
820802.2%
 
915161.6%
 
Other values (91)87459.2%
 
(Missing)16161.7%
 
ValueCountFrequency (%) 
02127522.3%
 
11929720.3%
 
21308313.7%
 
393689.8%
 
468407.2%
 
ValueCountFrequency (%) 
1301< 0.1%
 
1291< 0.1%
 
1281< 0.1%
 
1221< 0.1%
 
1201< 0.1%
 

CheckintoCheckoutNBR
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct852
Distinct (%)1.0%
Missing6765
Missing (%)7.1%
Infinite0
Infinite (%)0.0%
Mean50.27244054
Minimum-911
Maximum596
Zeros231
Zeros (%)0.2%
Memory size743.9 KiB

Quantile statistics

Minimum-911
5-th percentile-63
Q139
median56
Q375
95-th percentile128
Maximum596
Range1507
Interquartile range (IQR)36

Descriptive statistics

Standard deviation69.44975345
Coefficient of variation (CV)1.381467713
Kurtosis16.50842071
Mean50.27244054
Median Absolute Deviation (MAD)18
Skewness-2.346794046
Sum4446899
Variance4823.268255
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
5014641.5%
 
5714461.5%
 
5614171.5%
 
5314151.5%
 
5514101.5%
 
4814101.5%
 
4914051.5%
 
5413961.5%
 
4713951.5%
 
5813841.5%
 
Other values (842)7431478.0%
 
(Missing)67657.1%
 
ValueCountFrequency (%) 
-9113< 0.1%
 
-8401< 0.1%
 
-7493< 0.1%
 
-7471< 0.1%
 
-7081< 0.1%
 
ValueCountFrequency (%) 
5961< 0.1%
 
5882< 0.1%
 
5661< 0.1%
 
5481< 0.1%
 
5281< 0.1%
 

AppttoCheckoutNBR
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct940
Distinct (%)1.1%
Missing6765
Missing (%)7.1%
Infinite0
Infinite (%)0.0%
Mean42.3217871
Minimum-933
Maximum768
Zeros248
Zeros (%)0.3%
Memory size743.9 KiB

Quantile statistics

Minimum-933
5-th percentile-75
Q131
median48
Q367
95-th percentile123
Maximum768
Range1701
Interquartile range (IQR)36

Descriptive statistics

Standard deviation72.89260552
Coefficient of variation (CV)1.722342333
Kurtosis16.39418989
Mean42.3217871
Median Absolute Deviation (MAD)18
Skewness-1.940641188
Sum3743616
Variance5313.331939
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4414561.5%
 
5014451.5%
 
4614101.5%
 
4114051.5%
 
4313931.5%
 
4513641.4%
 
4213621.4%
 
3913611.4%
 
4813561.4%
 
3713531.4%
 
Other values (930)7455178.3%
 
(Missing)67657.1%
 
ValueCountFrequency (%) 
-9333< 0.1%
 
-9051< 0.1%
 
-8192< 0.1%
 
-7623< 0.1%
 
-7411< 0.1%
 
ValueCountFrequency (%) 
7681< 0.1%
 
6572< 0.1%
 
6451< 0.1%
 
5861< 0.1%
 
5781< 0.1%
 

CheckintoApptNBR
Real number (ℝ)

MISSING
ZEROS

Distinct406
Distinct (%)0.4%
Missing4116
Missing (%)4.3%
Infinite0
Infinite (%)0.0%
Mean6.913780802
Minimum-958
Maximum656
Zeros5038
Zeros (%)5.3%
Memory size743.9 KiB

Quantile statistics

Minimum-958
5-th percentile-9
Q12
median7
Q313
95-th percentile27
Maximum656
Range1614
Interquartile range (IQR)11

Descriptive statistics

Standard deviation21.2410614
Coefficient of variation (CV)3.072278687
Kurtosis284.162105
Mean6.913780802
Median Absolute Deviation (MAD)6
Skewness-10.72325705
Sum629880
Variance451.1826895
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
050385.3%
 
646744.9%
 
546104.8%
 
445844.8%
 
343934.6%
 
743034.5%
 
841574.4%
 
240934.3%
 
939524.2%
 
1037303.9%
 
Other values (396)4757150.0%
 
(Missing)41164.3%
 
ValueCountFrequency (%) 
-9581< 0.1%
 
-9511< 0.1%
 
-7271< 0.1%
 
-7161< 0.1%
 
-5281< 0.1%
 
ValueCountFrequency (%) 
6561< 0.1%
 
4331< 0.1%
 
2701< 0.1%
 
2442< 0.1%
 
2192< 0.1%
 

Arrived24NBR
Real number (ℝ≥0)

MISSING
ZEROS

Distinct190
Distinct (%)0.2%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean13.98278938
Minimum0
Maximum276
Zeros2506
Zeros (%)2.6%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile1
Q14
median9
Q318
95-th percentile45
Maximum276
Range276
Interquartile range (IQR)14

Descriptive statistics

Standard deviation16.27616093
Coefficient of variation (CV)1.164013881
Kurtosis21.75992182
Mean13.98278938
Median Absolute Deviation (MAD)6
Skewness3.406373969
Sum1308859
Variance264.9134147
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
360656.4%
 
460416.3%
 
558406.1%
 
256856.0%
 
652035.5%
 
151615.4%
 
746944.9%
 
844274.6%
 
940064.2%
 
1035273.7%
 
Other values (180)4295645.1%
 
ValueCountFrequency (%) 
025062.6%
 
151615.4%
 
256856.0%
 
360656.4%
 
460416.3%
 
ValueCountFrequency (%) 
2761< 0.1%
 
2712< 0.1%
 
2691< 0.1%
 
2671< 0.1%
 
2361< 0.1%
 

Providers24CNT
Real number (ℝ≥0)

MISSING
ZEROS

Distinct55
Distinct (%)0.1%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean6.281651621
Minimum0
Maximum57
Zeros2506
Zeros (%)2.6%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median5
Q38
95-th percentile18
Maximum57
Range57
Interquartile range (IQR)6

Descriptive statistics

Standard deviation5.572235754
Coefficient of variation (CV)0.8870653914
Kurtosis4.512995795
Mean6.281651621
Median Absolute Deviation (MAD)3
Skewness1.793217685
Sum587994
Variance31.0498113
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
21189212.5%
 
31096411.5%
 
11087611.4%
 
4960010.1%
 
581678.6%
 
665326.9%
 
753785.6%
 
843684.6%
 
936943.9%
 
1030353.2%
 
Other values (45)1909920.1%
 
ValueCountFrequency (%) 
025062.6%
 
11087611.4%
 
21189212.5%
 
31096411.5%
 
4960010.1%
 
ValueCountFrequency (%) 
571< 0.1%
 
561< 0.1%
 
535< 0.1%
 
522< 0.1%
 
516< 0.1%
 

ThatProvider24NBR
Real number (ℝ≥0)

MISSING
ZEROS

Distinct272
Distinct (%)0.3%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean15.92156402
Minimum0
Maximum383
Zeros24825
Zeros (%)26.1%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median6
Q321
95-th percentile64
Maximum383
Range383
Interquartile range (IQR)21

Descriptive statistics

Standard deviation25.22748264
Coefficient of variation (CV)1.584485206
Kurtosis20.84035885
Mean15.92156402
Median Absolute Deviation (MAD)6
Skewness3.54510997
Sum1490338
Variance636.4258806
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
02482526.1%
 
256976.0%
 
146554.9%
 
333883.6%
 
433663.5%
 
526922.8%
 
626002.7%
 
723632.5%
 
821872.3%
 
920112.1%
 
Other values (262)3982141.8%
 
ValueCountFrequency (%) 
02482526.1%
 
146554.9%
 
256976.0%
 
333883.6%
 
433663.5%
 
ValueCountFrequency (%) 
3832< 0.1%
 
3811< 0.1%
 
3801< 0.1%
 
3671< 0.1%
 
2893< 0.1%
 

NoshowRate24NBR
Real number (ℝ≥0)

MISSING
ZEROS

Distinct1054
Distinct (%)1.1%
Missing1616
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean0.07126754929
Minimum0
Maximum1
Zeros56215
Zeros (%)59.0%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30.1
95-th percentile0.333333
Maximum1
Range1
Interquartile range (IQR)0.1

Descriptive statistics

Standard deviation0.1316014542
Coefficient of variation (CV)1.846583129
Kurtosis11.80492712
Mean0.07126754929
Median Absolute Deviation (MAD)0
Skewness2.919739412
Sum6670.998951
Variance0.01731894274
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
05621559.0%
 
0.216701.8%
 
0.2516511.7%
 
0.33333315681.6%
 
0.16666614961.6%
 
0.14285714211.5%
 
0.12512771.3%
 
0.11111112471.3%
 
0.511121.2%
 
0.110781.1%
 
Other values (1044)2487026.1%
 
(Missing)16161.7%
 
ValueCountFrequency (%) 
05621559.0%
 
0.003611< 0.1%
 
0.0036761< 0.1%
 
0.0061341< 0.1%
 
0.0062111< 0.1%
 
ValueCountFrequency (%) 
13410.4%
 
0.8751< 0.1%
 
0.8571422< 0.1%
 
0.8333339< 0.1%
 
0.8181812< 0.1%
 

EdVisitsNBR
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct45
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.649426072
Minimum0
Maximum155
Zeros67942
Zeros (%)71.4%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum155
Range155
Interquartile range (IQR)1

Descriptive statistics

Standard deviation2.271322781
Coefficient of variation (CV)3.497430853
Kurtosis1785.144056
Mean0.649426072
Median Absolute Deviation (MAD)0
Skewness30.76642438
Sum61839
Variance5.158907174
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
ValueCountFrequency (%) 
06794271.4%
 
11480215.5%
 
257856.1%
 
328633.0%
 
414381.5%
 
57480.8%
 
64160.4%
 
73640.4%
 
82310.2%
 
91470.2%
 
Other values (35)4850.5%
 
ValueCountFrequency (%) 
06794271.4%
 
11480215.5%
 
257856.1%
 
328633.0%
 
414381.5%
 
ValueCountFrequency (%) 
1553< 0.1%
 
1541< 0.1%
 
1492< 0.1%
 
1431< 0.1%
 
1421< 0.1%
 

IpVisitsNBR
Real number (ℝ≥0)

ZEROS

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.02110878903
Minimum0
Maximum6
Zeros93510
Zeros (%)98.2%
Memory size743.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum6
Range6
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.1720177931
Coefficient of variation (CV)8.149107599
Kurtosis186.7435383
Mean0.02110878903
Median Absolute Deviation (MAD)0
Skewness11.29173591
Sum2010
Variance0.02959012113
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%) 
09351098.2%
 
114971.6%
 
21580.2%
 
336< 0.1%
 
414< 0.1%
 
63< 0.1%
 
53< 0.1%
 
ValueCountFrequency (%) 
09351098.2%
 
114971.6%
 
21580.2%
 
336< 0.1%
 
414< 0.1%
 
ValueCountFrequency (%) 
63< 0.1%
 
53< 0.1%
 
414< 0.1%
 
336< 0.1%
 
21580.2%
 

NoShowFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
86115 
1
9106 
ValueCountFrequency (%) 
08611590.4%
 
191069.6%
 

CancelledLateFLG
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size743.9 KiB
0
80000 
1
15221 
ValueCountFrequency (%) 
08000084.0%
 
11522116.0%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
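
The calculation just described can be sketched in plain Python. This is an illustration of the textbook formula, not the report generator's own implementation; the function name and the sample data are made up for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 in the denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting and rescaling x by a positive factor leaves r unchanged.
r_rescaled = pearson_r([3 * a + 7 for a in x], y)
print(r, r_rescaled)
```

Negating one of the variables flips the sign of r while keeping its magnitude.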

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
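
A minimal sketch of that procedure: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their ranks is an assumption of this sketch (a common convention); the helper names and data are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is nonlinear but strictly increasing.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this strictly increasing relationship ρ reaches 1 while Pearson's r stays below 1.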

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
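
The pair-counting definition can be written directly. The sketch below implements the formula exactly as stated (often called tau-a); statistical libraries may use a tie-adjusted variant, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```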

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
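
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table using the correction commonly attributed to Bergsma (2013). Whether the report generator computes it in exactly this way is an assumption; the function name and example tables are hypothetical.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts).

    Assumes every row total and column total is non-zero.
    """
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical 2x2 tables: no association, then perfect association.
independent = [[10, 10], [10, 10]]
perfect = [[20, 0], [0, 20]]
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```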

Missing values

Sample

First rows

,AppointmentID,PatientID,ClinicNM,AppointmentDTS,AppointmentMonthNBR,AppointmentWeekdayNBR,AppointmentHourNBR,AgeNBR,SexFLG,HispanicFLG,SingleFLG,LivesInApartmentFLG,EmailFLG,ApptLagNBR,InsuranceDSC,HypertensionFLG,AsthmaFLG,HeartDiseaseFLG,ObeseFLG,DiabetesFLG,Noshow24NBR,CancellationsNBR,Latearrivals24NBR,CheckintoCheckoutNBR,AppttoCheckoutNBR,CheckintoApptNBR,Arrived24NBR,Providers24CNT,ThatProvider24NBR,NoshowRate24NBR,EdVisitsNBR,IpVisitsNBR,NoShowFLG,CancelledLateFLG
0,21725,1,E,4/10/18,4,3,17,50,F,0,1,1,0,32,Commercial,0,0,0,0,0,0.0,0.0,1.0,49.0,50.0,-1.0,2.0,2.0,0.0,0.0,0,0,1,0
1,11206,2,A,2/7/18,2,4,10,80,M,0,0,0,0,2,Medicare,1,0,0,1,0,0.0,0.0,1.0,50.0,49.0,1.0,2.0,1.0,5.0,0.0,0,0,0,1
2,12548,2,A,2/8/18,2,5,16,80,M,0,0,0,0,0,Medicare,1,0,0,1,0,0.0,1.0,1.0,50.0,49.0,1.0,2.0,1.0,6.0,0.0,0,0,0,0
3,12727,2,A,3/8/18,3,5,15,80,M,0,0,0,0,28,Medicare,1,0,0,1,0,0.0,1.0,1.0,55.0,48.0,7.0,3.0,1.0,8.0,0.0,0,0,0,0
4,86882,2,A,11/9/18,11,6,16,80,M,0,0,0,0,0,Medicare,1,0,0,1,0,0.0,1.0,0.0,61.0,47.0,13.0,3.0,1.0,13.0,0.0,0,0,0,0
5,95113,3,A,12/28/18,12,6,15,32,F,0,1,0,1,1,Commercial,0,0,0,0,0,0.0,0.0,1.0,47.0,45.0,2.0,3.0,3.0,11.0,0.0,0,0,0,0
6,56930,3,A,8/3/18,8,6,15,32,F,0,1,0,1,18,Commercial,0,0,0,0,0,0.0,0.0,1.0,111.0,115.0,-3.0,2.0,2.0,13.0,0.0,0,0,0,0
7,19704,4,A,3/5/18,3,2,16,37,F,0,1,0,0,0,Commercial,0,0,0,0,0,1.0,1.0,1.0,58.0,43.0,8.0,4.0,4.0,11.0,0.2,0,0,0,0
8,29351,5,C,4/5/18,4,5,10,23,F,0,0,1,0,0,Medicaid,0,0,0,0,0,0.0,12.0,4.0,75.0,65.0,10.0,27.0,17.0,0.0,0.0,0,0,0,0
9,92214,6,B,12/7/18,12,6,14,33,M,1,1,0,0,2,Self Pay,0,0,0,0,0,0.0,0.0,0.0,NaN,NaN,NaN,0.0,0.0,0.0,0.0,0,0,0,0

Last rows

,AppointmentID,PatientID,ClinicNM,AppointmentDTS,AppointmentMonthNBR,AppointmentWeekdayNBR,AppointmentHourNBR,AgeNBR,SexFLG,HispanicFLG,SingleFLG,LivesInApartmentFLG,EmailFLG,ApptLagNBR,InsuranceDSC,HypertensionFLG,AsthmaFLG,HeartDiseaseFLG,ObeseFLG,DiabetesFLG,Noshow24NBR,CancellationsNBR,Latearrivals24NBR,CheckintoCheckoutNBR,AppttoCheckoutNBR,CheckintoApptNBR,Arrived24NBR,Providers24CNT,ThatProvider24NBR,NoshowRate24NBR,EdVisitsNBR,IpVisitsNBR,NoShowFLG,CancelledLateFLG
95211,83889,33469,E,10/30/18,10,3,11,65,F,0,1,0,1,1,Commercial,0,0,0,0,0,0.0,1.0,1.0,129.0,124.0,5.0,4.0,3.0,5.0,0.000000,0,0,0,1
95212,83896,33469,E,10/30/18,10,3,14,65,F,0,1,0,1,1,Commercial,0,0,0,0,0,0.0,1.0,1.0,129.0,124.0,5.0,4.0,3.0,5.0,0.000000,0,0,0,0
95213,90497,33469,E,12/10/18,12,2,9,65,F,0,1,0,1,13,Commercial,0,0,0,0,0,0.0,2.0,1.0,111.0,104.0,7.0,5.0,3.0,9.0,0.000000,0,0,0,0
95214,23040,33470,A,3/15/18,3,5,16,24,M,0,1,0,1,1,Commercial,0,1,0,0,0,1.0,1.0,2.0,47.0,54.0,-6.0,2.0,1.0,6.0,0.333333,0,0,0,0
95215,53297,33471,A,7/3/18,7,3,14,83,F,0,1,1,0,1,Medicare,0,0,0,0,0,0.0,1.0,0.0,60.0,52.0,8.0,5.0,2.0,16.0,0.000000,0,0,0,0
95216,20483,33472,A,3/21/18,3,4,7,38,F,0,0,1,0,15,Commercial,0,0,0,0,0,0.0,1.0,0.0,52.0,47.0,5.0,4.0,3.0,7.0,0.000000,0,0,0,0
95217,81514,33472,A,10/19/18,10,6,8,38,F,0,0,1,0,1,Commercial,0,0,0,0,0,0.0,2.0,0.0,50.0,44.0,6.0,4.0,2.0,0.0,0.000000,0,0,1,0
95218,81507,33472,A,10/18/18,10,5,11,38,F,0,0,1,0,0,Commercial,0,0,0,0,0,0.0,1.0,0.0,50.0,44.0,6.0,4.0,2.0,0.0,0.000000,0,0,0,1
95219,1763,33473,A,6/5/18,6,3,8,40,M,0,1,0,1,182,Commercial,1,0,0,0,0,0.0,3.0,0.0,45.0,32.0,13.0,5.0,1.0,17.0,0.000000,0,0,0,0
95220,46027,33473,A,12/18/18,12,3,8,40,M,0,1,0,1,196,Commercial,1,0,0,0,0,0.0,1.0,0.0,53.0,44.0,8.0,4.0,2.0,11.0,0.000000,0,0,0,0